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1.  Introduction  and  Summary 


In  recent  years  computer  simulation  has  become  a very  important 
tool  for  analyzing  the  behavior  of  stochastic  processes.  As  the  structures 
of  widely  used  processes  become  increasingly  complex,  analytic  results 
become  more  difficult  to  obtain.  Frequently  simulation  is  the  only  com- 
putationalLy  feasible  method  to  study  a process. 

Unfortunately,  simulation  can  be  a very  expensive  tool  to  use.  It 
is  therefore  desirable  to  develop  methods  that  can  reduce  the  run  lengths 
(and  hence  cost)  of  a simulation,  yet  still  give  accurate  estimates.  Such 
methods  are  called  variance  reduction  techniques.  This  paper  will  propose 
and  test  a new  variance  reduction  technique  in  the  special  case  when  the 
stochastic  process  being  simulated  is  a Markov  process.  A subsequent  paper 
will  describe  several  other  related  techniques  applicable  when  the  iMarkov 
process  has  a finite  state  space. 

As  an  example  of  how  expensive  simulations  can  be,  consider  estimating 
via  simulation  E(W),  the  expected  stationary  waiting  time  in  an  M/M/1 
queue.  The  M/'M/ 1 queue  is  not  something  that  one  would  ordinarily 
simulate  since  analytic  results  for  it  are  readily  available.  However, 
despite  its  simplicity  the  M/M/ 1 queue  can  be  a very  difficult  and 
expensive  process  to  simulate.  It  is  therefore  a good  candidate  for  testing 
simulation  methodologies.  Let  X be  the  rate  of  the  Poisson  arrival  process 
and  let  |a”^  be  the  mean  of  the  exponentially  distributed  service  times. 
Define  the  usual  traffic  intensity  p = Now  let  be  the  waiting 


(1971)),  then 

Vn(W  - E(W)) 

(1.1)  =^N(0,1)  as  N -♦« 

for  some  constant  a (0  < o < «).  Here  ^ denotes  weak  convergence 
or  convergence  in  distribution  (see  Billingsley  (I968))  and  N(0,1)  is 
a normally  distributed  random  variable  with  mean  0 and  variance  1. 

From  (1.1)  we  can  form  the  following  approximate  100  X (l-a)56  confidence 
interval  for  E(W) 

(1.2)  (t\  - Zq^2  V2 

where^  for  a given  is  defined  by  P{N(0, 1)  > Zq^2^  " 

Suppose  we  want  to  run  our  simulation  until  the  half  length  of  our 
confidence  interval  is  some  prespecified  fraction  of  E(W) ; i.e.^  we  want 
^q/2  ” 8E(W)  for  some  6.  For  example  if  we  (somewhat  arbitrarily) 

pick  a = .1  and  S = .1  we  then  seek  to  form  a 90’]^  confidence  interval 
whose  half  length  is  10^  of  the  quantity  we  are  interested  in  estimating. 

The  number  of  customers^  that  need  to  be  simulated  to  obtain  this  level 
of  accuracy  is  given  in  Table  1.  It  is  seen  that  as  p increases  (beyond 
p = .5)  the  required  run  lengths  increase  rapidly  until  for  large  values 

2 
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of  p one  must  simulate  such  an  enormous  number  of  customers  to  obtain 
"decent"  estimates  that  simulation  is  no  longer  a feasible  alternative. 
It  is  run  lengths  such  as  these  that  variance  reduction  techniques  are 
designed  to  cut  down. 


TABLE  1 

SAMPLE  SIZES  FOR  THE  l^l^l  QUEUE 

N = Number  of  customers  that  must  be  simulated  for  a 90^ 
confidence  interval  for  E(W)  to  have  a half  length  of  . 10  E(W) 


(P  = 1. 

.0,  A = p,  E(W) 

A . 
p(p-^) 

p 

E(W) 

2 

a 

N 

.10 

.111 

8,200 

.20 

.250 

1.39 

6,020 

.30 

.429 

3.96 

5,830 

.40 

.667 

10.6 

6,430 

.50 

1.00 

29.0 

7,850 

.60 

1.50 

88.5 

10,600 

.70 

2.33 

335 

16,700 

.80 

4.00 

1,976 

33,400 

.90 

9.00 

35,901 

119,000 

.95 

19.0 

607,600 

455,000 

.99 

99.0 

3.95  X 10® 

1.09  X 10^ 
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One  of  Che  most  effective  variance  reduction  techniques  is  that  of 
control  variables.  A good  introduction  to  this  technique  is  given  in  Che 


L 


book  by  Gaver  and  Thompson  ( 1973) • Recent  studies  involving  control 
variables  may  be  found  in  Gaver  and  Shedler  (I97I),  Iglehart  and  Lewis 
(1976),  Lavenberg  (l97k),  Lavenberg^  Moeller  and  Sauer  (1977),  Lavenberg^ 
Moeller  and  Welch  (1977),  and  Lavenberg  and  Shedler  (1975)-  Since  the 
technique  about  to  be  proposed  is  closely  related  Co  this  method^  we  now 
give  a brief  outline  of  control  variables  before  proceeding.  Let 
(X  , n > 0)  be  a sequence  of  i.i.d.  (independent  and  identically  distri- 


buced)  random  variables  with  unknown  mean  r = E(X  ).  We  shall  be  interested 


2 2 

in  estimating  r via  simulation.  Let  a = a (X  ),  the  variance  of  X 

N ” 


We  can  estimate  r by  ^ = Z X /N  and  can  form  a confidence  interval 

n=l  ” 


for  r using  the  central  limit  theorem 


(1.3) 


VN(Vr) 


N(0, 1) 


as  N -*  00 


Suppose  that  we  now  have  random  variables  n > 0)  such  that  the 

are  i.i.d.  X and  C are  correlated  (usually  achieved  by  simulating 
' n n 


X and  C with  the  same  stream  of  random  numbers)  and  for  which 
n n 


(l.i^) 


r = E(C  ) 
c n' 


is  known.  Let  p be  some  constant  and  set  Z^O)  = X^  + P(C^-r^).  Then 


(Z  O)  n > 0)  are  i.i.d.  with  mean  r and  variance  which  will  be 


N 


denoted  by  o O) . We  let  Z^^O)  = E ZJP)/n.  We  have  by  the  strong 

n=l 


law  of  large  numbers 


(1.5) 


a.s.  (almost  surely)  as  N -* » 


and  by  the  central  limit  theorem; 


(1.6) 


VN(y^)  - 0 


=»N(0,l) 


as  N -» • 


We  now  pick  ^ s to  minimize  the  variance  term  o (0).  It  is  easy  to 
show  that 


(1.7) 


(1.8) 


P*  = -cov(X^,  Cj/o^{Cj  , 


= (1  - p^(y  y)  <^x 


where  o(X  , C ) is  the  coefficient  of  correlation  between  X and  C . 

Ti*  n'  n n 

2 

Since  0<p(X,C)<l  a reduction  in  variance  has  been  obtained  and 
— ' n^  n'  — 

we  are  thus  able  to  form  shorter  confidence  intervals  for  r.  C is 

n 

called  a control  variable  for  X . The  method  can  be  extended  to  allow 

n 

multiple  controls  (see  Lavenberg,  Moeller  and  Sauer  (1977)). 

The  key  things  to  observe  are  that  r^  = E(C^)  must  be  known  and 
that  X^  and  must  be  highly  correlated  to  get  large  variance  reduc- 

tions. It  is  often  very  difficult  to  come  up  with  good  controls,  particularly 
if  the  stochastic  process  being  simulated  is  very  complicated.  The  method 
to  be  proposed  in  this  paper  circumvents  this  difficulty  by  devising  con- 
trols which  will  usually  be  highly  correlated  with  the  process  of  interest 
and  for  which  the  means  of  the  controls  need  not  be  explicitly  known.  This 
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Is  because  the  controls  are  chosen  in  such  a way  that  their  means  actually 
equal  the  parameter  of  interest  ( r) . 


We  shall  initially  restrict  ourselves  to  studying  controls  for 
functionals  of  the  stationary  distribution  of  regenerative  discrete  time 
Markov  processes.  Examples  of  such  processes  are  positive  recurrent  Harkov 
chains  and  the  server  workload  process  in  multiple  server  queues  in  light 
traffic  (from  which  the  waiting  time  process  can  be  derived).  In  Section  2 
we  state  results  for  such  processes.  Section  3 contains  a description  and 
discussion  of  the  variance  reduction  technique.  We  also  show  in  Section  3 
how  the  method  can  be  extended  to  different  types  of  stochastic  processes 
including  continuous  time  Markov  chains^  semi-Markov  processes  and  certain 
types  of  stationary  processes  (\#hich  may  be  neither  regenerative  nor 
Markovian).  Numerical  examples,  taken  from  queueing  theory,  demonstrating 
the  effectiveness  of  the  method  are  presented  in  Section  k. 


2.  Regenerative  Markov  Processes 


In  this  section  we  introduce  notation  and  state  some  preliminary 
results  for  discrete  time  regenerative  Markov  processes.  These  results 
will  be  useful  in  developing  variance  reduction  techniques  for  this  class 
of  stochastic  processes.  The  notation  and  definitions  in  the  early  portion 
of  this  section  follow  Orey  (I97I)  fairly  closely. 

Let  (E^g.)  he  a measurable  space;  i.e.,  let  g be  a sigma  algebra 
of  subsets  of  o.  Let  IR  denote  the  real  numbers. 

(2.1)  DEFINITION.  A function  P:(E^6)  -*11  is  said  to  be  a probability 
transition  function  if  it  satisfies 

(a)  P(x,*)  is  a probability  measure  on  (E^g)  for  all  x € E. 

(b)  P(*^B)  is  a measurable  function  with  respect  to  £ for  all  B € £. 


Let  E*  = E X E X • • • and  let  be  the  product  sigma  field 

e X 6 X •••  . For  any  cu  = (03^,  ...)  £ e"  let  X^(a))  = o)^  for  each 

n > 0.  Let  la  be  some  probability  measure  on  {E^g),  called  the  initial 

probability  distribution.  There  then  exists  a probability  measure  P 

on  (e“  &“)  such  that  for  all  n > 0 and  B.,  ....  B £ £ 

' 7 / — O'  ' n 


(2.2) 


Vo  ^ V "'l  ^ ®1>  ""n  ^ V 


®0  ®1  ®n 
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If  Y is  any  random  variable  let  ■ I region 

of  integration  is  omitted  it  is  assumed  to  be  the  whole  space  C) . If  Y 

is  an  indicator  function;  i.e.,  if  Y(ou)  > l^gj(u>)  (■  1 if  u)  € B and 

0 if  coj^B)  let  P(B)3E(1,,).  If  for  some  x c E we  have 

I *>  J 

^([x}]  = 1 we  then  write  C for  E . 

X 4 


(2.3)  DEFINITION.  The  n-step  probability  transition  functions  p"  are 
defined  by 

(a)  P°(x,B)  = 

(b)  p\x,B)  = P(x,B), 

(c)  P^Cx^B)  =/  P"‘\x,  dy)  P(y,B),  for  n > 2. 

It  can  then  be  shown  that 

(2.4)  P^{X^  € b|Xq,  Xj^)  = P"‘‘'(Xj^,B)  a.s. 

Equation  (2.4)  is  known  as  the  Markov  property  and  (X^^  n > 0)  is  said 
to  be  a Markov  process  with  state  space  E^  initial  distribution  and 

stationary  transition  function  P. 

(2.5)  DEFINITION.  Let  x be  any  probability  measure  on  (E^g).  The 
probability  measure  xP  on  (E^6)  is  defined  by 

xP(B)  = / x(dx)  P(x,B)  , for  all  BEE  . 
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[ 


i 


I 

I 

! 

i 


I 
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(2.6)  DEFINITION.  For  any  n > 0 and  any  function  £:E  -*01  such  that 
/ dy)  |f(y)!  < eo  define  the  function  P^frE  ~»TR  by 

P"f(x)  = / P"(x,  dy)  f(y)  . 

If  the  state  space  E is  countable  we  can  label  the  states  by  the 
nonnegative  integers  (0,  1,  2^  ...).  Letting  P^j  ■ - jlx^  ■ l) 

and  = P(Xq  = i]  we  have  (X^,  n > 0)  being  a Markov  chain  with 
transition  matrix  P and  initial  distribution  u.  The  integrals  in  (2.2) 
to  (2.6)  can  be  replaced  by  sums;  e.g.^  if  x = 1 and  B > (k)  Chen  (2.3.c) 
becomes 


z 

j=o 


which  is  a special  case  of  Che  familiar  Chapman- Kolmogorov  equations  for 
Markov  chains.  In  the  countable  case  Definition  (2.6)  becomes 

P"  f(i)  = Z P"  f{j)  . 

j=0  ^ 

The  reader  is  referred  to  the  books  by  Chung  (I967)  ^nd  Karlin  and  Taylor 
( 1975)  for  a more  detailed  analysis  of  Markov  chains. 

In  order  Co  develop  results  which  will  be  useful  in  simulation  we 
make  the  additional  hypothesis  that  the  Markov  process  is  regenerative 
(see  IJinlar  (I975)  or  Crane  and  Iglehart  (I975));  that  is  we  assume  that 
there  exists  some  x^  € E such  that  with  probability  1^  X^  = x^  infinitely 
often.  Let  T denote  the  mth  time  the  process  enters  the  state  x . 


L 
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(2.7)  DEFINITION.  For  m > 0 T Is  defined  (recursively)  by 

(^) 

(b)  T = inf(n  > T ,:X  * x.)  for  m > 1. 

jjj  in- 1 n V * 

Notice  chat  is  a stopping  time  (see  Cinlar  (I97?)).  Because 

X = X.  infinitely  often.  T < as  for  each  m and  T increases  to  os 
n 0 ' m n 

as  m -*  00.  Let  t = T - T , for  m > 1.  For  each  m > I denote  the 

m m ID- 1 • “ 

random  vector  V by 


(2.8) 


(*T  / 

m-  1 


T ,+l> 
m- 1 


T -1» 
in 


(2.9)  PROPOSITION.  If  Xq  = Xp  a.s.  then  m > 1}  is  a sequence 

of  i.i.d.  random  vectors.  If  P(X-  = x.^)  < 1 then  {V  . m > 1)  is  a 

sequence  of  independent  random  vectors  although  Che  distribution  of  V 

***  1 

may  be  different  than  the  (common)  distribution  of  fV  . m > 2). 


PROOF.  See  Proposition  1 of  Crane  and  Iglehart  (1974b).  □ 


The  process  is  said  to  be  in  the  mth  cycle  between  tiroes  T , and 

in-  I 

T - 1,  and  t is  called  the  length  of  Che  mth  cycle.  The  fact  that 
m ^ tn 

fV  , m > 1)  are  i.i.d.  (assuming  X.  = x.  a.s.)  just  means  that  the 

— u u 

behavior  of  the  process  over  a cycle  has  the  same  distribution  and  is 
independent  of  the  behavior  of  Che  process  over  any  other  cycle.  The 
importance  of  this  in  Che  context  of  simulation  is  that  the  simulation 
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run  can  then  be  broken  up  into  randomly  spaced  i.i.d.  blocks  so  that  the 
techniques  of  classical  statistics  can  be  used  to  analyze  the  output  (see 
Crane  and  Iglehart  (I975)). 

We  now  make  the  additional  assumptions  that 


(2.9) 


(2.10) 


P (Tj  < 00)  » 1 ^ for  all  y € E 


(t.)  < » 


(2.11)  The  (common)  distribution  of  (t  , m > 2)  is  aperiodic 

m'  — 


( see  Cinlar  ( I975))  . 


Under  these  conditions  the  following  proposition  (whose  proof  may  be  found 
in  either  ^inlar  (1975)  or  Crane  and  Iglehart  (1975))  is  true. 


(2.12)  PROPOSITION.  There  exists  a random  element  X and  a probability 
measure  tt  on  (E^S)  with  P(X  € A)  = Tr(A)  for  all  A € £ such  that 

(a)  X, 

(b)  p"(y,A)  -♦Tr(A)  for  all  A £ £ and  y £ E, 

(c)  - satisfies  irP  = i.e.,  7r(A)  = / T(dy)  P(y,A)  for  all  A £ e. 


The  probability  measure  tt  is  called  the  stationary  distribution 
of  the  process.  In  the  countable  case  equation  (2.12.c)  becomes  (for 
A = (j)) 


11 


(2.15) 


00 


ir  = E TT.  P. , 


Now  let  f:E  -»]R  and  assune  that  Tr|f|  = / Tr(dy)  lf(y)|  < «, 

One  is  often  interested  in  knowing  Trf  = E(f(X))  for  a variety  of 
functions  f.  For  example  if  E = R and  f(x)  = then 

irf  = P{X  < y)  and  if  f(x)  = x**  then  TTf  = E(X*^) . Since  the  system  of 
equations  defined  in  (2. 12.c)  (or  (2.15))  very  difficult  to  solve 

(for  example  if  E is  countably  infinite  or  if  E is  finite  but  very 
large)  it  may  become  necessary  to  estimate  Trf  via  simulation.  It  is 
the  efficient  estimation  of  such  quantities  that  we  will  concern  ourselves 


Assume  now  that  X^  = x^  a.s.  Let  k be  some  positive  integer 
and  for  v = 0,  1,  ...,  k let  f^:E->B.  Let  r^  = rrf^  = E(  f^(X)) 
(assume  that  Tr'f^l  < ») . For  each  m>  1 and  v = 0^  ...^  k define 
Yjv)  by 


(2.U) 


Y (v)  = Z f (X  ) 
m'  ' n'' 

n=T 
m-  1 


Because  of  the  regenerative  nature  of  the  Markov  chain;  i.e.,  because 

the  V^’s  are  i.i.d.,  ((YjO),  ...,  Yjk),  T J , m>  1}  are  i.i.d. 

random  vectors  (although  for  a fixed  m.  Y (O) Y (k)  and  T 

' m ' ' ' ra  m 

are  in  general  dependent  random  variables).  The  following  proposition 
gives  an  expression  for  r^  (in  terms  of  Y^(v)  and  t^)  that  is  useful 
in  simulation.  The  proof  of  the  proposition  may  be  found  in  Crane  and 
Iglehart  ( I975) . 
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(2.15)  PROPOSITION.  If  Trif.l  <»  then  r = Tf  =E  (Y  (v))/E  (t). 

' ' V ' V V m'  •" 


From  now  on  we  will  drop  the  "Xq"  in  E^  ( ) with  the  jnderstanding 

that  PfX..  = x^]  = 1.  Now  set  Z (v)  = Y (v)  - r t . By  the  previous 
^ 0 O'"  m'  ^ m'  ^ V m ^ 

proposition  we  have  for  each  v = 0,  each  m > 1 


(2.16) 


E(ZJv))  = 0 . 


For  each  i = 0^  ...  , k and  j = 0^  . . . ^ k let 


(2.17) 


“ij  - 


and  write  for  Let  be  the  symmetric  (k+1)  by  (k+1) 

dimensional  matrix  whose  (i^j)th  entry  is  is  then  the 

covariance  matrix  of  the  Z (v)  's.  We  assume  that  each  element  of  E, 

m k 

is  finite  and  that  Ej^  is  positive  definite  (in  general  it  is  positive 
semidefinite) . 

We  now  state  some  additional  limit  theorems  for  Markov  processes 
from  which  point  estimates  and  confidence  intervals  for  each  r^  can 
be  obtained.  Let 


(2.18) 


(2.19) 


M M 

^ (M)  = Z Y (v)/  Z T , 
V 1 in  ,111' 

m=l  iftsl 


/N  1 ” 

X (N)  = Z f (X  ) , 

v''  ’ N+1  - v'  n'  > 


for  each  v = 


0^  1^  . . . , k. 


According  to  the  following  strong  laws  of  large  numbers  we  can 
derive  strongly  consistent  point  estimates  for  r^  using  either  r^(M) 
or  x^(N).  The  proof  may  be  found  in  Crane  and  Iglehart  (1975)- 


(2.20)  PROPOSITION.  If  Tr|f^|  < then 


r 


(M) 


a.  s.  as  M -♦  00  , 


x^(N)  -*  r^  a.s.  as  N -♦  oo  . 

Notice  that  is  an  estimator  for  r^  based  on  M cycles 

of  the  process  and  x^(N)  estimates  r^  based  on  N transitions  of 
the  process. 

To  derive  confidence  intervals  for  the  I’y's  we  can  use  a multi- 
dimensional central  limit  theorem.  If  A is  a symmetric  positive 
definite  (k+1)  by  (k+1)  dimensional  matrix  let  N(0^  A)  denote  a 
(k+1)  dimensional  random  vector  having  a multivariate  normal  distribution 
with  means  0 and  covariance  matrix  A.  For  any  real  number  a ^ 0 let 
A/a  be  the  (k+l)  by  (k+1)  dimensional  matrix  whose  (i^j)th  entry  is 

Aij/a. 

(2.21)  PROPOSITION.  If  Tr|f^|  <«  for  each  V = 0,  1,  ..., 
is  a finite  positive  definite  matrix  then 


Ik 


k and  if 


(v^i  (^q(M)  - v^i  - ^))  ->N(0,  Z^E^(Tp) 

(^  (?j(N)  - Iq),  ...,  S (Xj^(N)  - Tj^)  =^N(0,  £j/E(t^))  . 

PROOF.  Apply  the  Cramer-Wold  device  described  on  page  48  of  Billingsley 
(1968)  to  the  central  limit  theorem  given  in  equation  (5.3)  of  Crane 
and  Iglehart  (1975).  ^ 

Now  let  g be  a (k+1)  dimensional  row  vector  of  real  numbers 

As  As 

whose  vth  entry  is  P(v).  Let  r,  £(M)  and  x(N)  denote  (k+1) 

As  A^ 

dimensional  column  vectors  whose  vth  entries  are  r^, 
respectively.  If  A is  a matrix  let  A'  denote  the  transpose  of  A; 
i.e.^  A.'^  = ^ simple  application  of  the  continuous  mapping  theorem 

(Theorem  (5.1)  o^  Billingsley  (1958]),  yields; 

k k 

(2.22)  PROPOSITION.  Let  cr^(g)  = g E g'  = Z Z P(  i)  cr..  P(j). 


i=0  j=0 


Under  the  hypotheses  of  Proposition  (2.21), 
VM  (g  r(M)  - gr) 

and 


as  M — > 00 


V 


VN  (g  3(N)  - gr) 


N(C,1) 


as  N 


k 

Note  that^  for  example^  ^ P(v)  r . A central  limit  theorem 

v=0  '' 

for  an  individual  r^  can  be  formed  using  Proposition  (2.22)  by  setting 
P(v)  = 1 and  P(  i)  =0  for  i v.  In  this  case  the  limit  theorems 
become^ 

Vm  (r  (M)  - r ) 

(2.25)  

Vn  (x  (N)  - r ) 

(2.21+)  — ^ =>N(0.1)  as  N . 

In  order  to  form  confidence  intervals  for  the  (or  linear  combinations 

of  the  r 's)  it  is  necessary  to  know  the  covariance  terms  a, . as  well 
V ij 

as  E(  Tj^) . In  a simulation  these  constants  are  usually  unknown  and  must 
be  estimated.  In  addition  g may  be  a fixed^  but  unknown^  vector  so  it 
too  must  be  estimated.  The  following  proposition  tells  us  that  we  may 
replace  these  quantities  in  Proposition  (2.22)  by  any  sequence  of  strongly 
consistent  estimators  and  not  destroy  the  asymptotic  normality. 

(2.25)  PROPOSITION.  Suppose  that  t^(M)  ^ E(  t^^)  a.s.,  CT^j(M)  ^-s. 

for  each  i and  j,  and  that  p( i^M)  -» p( i)  a.s.  for  each  i.  Let 
Z.  (M)  be  the  matrix  whose  (i^j)th  entry  is  a .(M),  let  g(M)  be  the 

K X j 

vector  whose  ith  component  is  p(i^M)  and  let  aj^(g,M)  = g(M)  2j^(M)  g'(M). 
Then 

Vm  (g(M)  2(M)  - g(M)r) 

— — — =>N(0,1)  as  M -» 00 
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PROOF.  By  Theorem  k.k  of  Billingsley  ( I968)  and  Proposition  (2.22)  we 


(Tj^(M),  g(M),  yM),  Vm  (£(M)  - r)) 


(E(Tl)  g,  N(0,  2^e2(tP)) 


We  then  have 


(\(g,M)/T^(M),  Vm  (g(M)  r(M)  - g(M)r)) 


(<^k^g)/E(^l),  N(0,  cT^(g)/E^(T^))  , 


/ ^ Vm  (g(M)  r(M)  - g(M)r)v 

(tVe.»)/\(»)>  a;fe)/E7T^) , 


and  finally 


Vm  (i(M)  2(M)  - g(M)r) 


N(0,1)  , 


the  last  three  steps  all  being  justified  by  the  continuous  mapping  theorem.  □ 


3.  Variance  Reduction  Techniques 


In  this  section  we  apply  the  results  of  Section  2 and  take  further 
advantage  of  the  structure  of  Markov  processes  to  obtain  variance  reductions. 
Let  us  now  fix  a function  f:E  -»]R  and  as  before  let  r = vf.  Our  goal 
is  to  obtain  both  point  estimates  and  "short"  confidence  intervals  for  r. 
This  will  be  achieved  by  forming  several  estimators  for  r and  then 
taking  the  (asymptotic)  minimum  variance  linear  combination  of  these 
estimates  which  is  strongly  consistent.  In  order  to  form  these  multiple 
estimates  some  additional  calculations  must  be  done  both  before  and  during 
the  simulation,  but  hopefully  their  cost  will  not  be  so  great  as  to  pro- 
hibit the  use  of  this  method. 

The  multiple  estimates  for  r are  formed  by  choosing  new  functions 
f^:E  so  that  irf^  = r for  each  value  of  v.  The  values  of  V for 

which  this  is  done  will  typically  be  "small"  and  labeled  as  (0,1, ...,k}. 

Once  the  have  been  computed  we  simulate  the  process  for,  say,  M 

cycles.  As  in  Section  2 let 


Y 

m 


(V) 


T -1 

IQ 

Z 


n=T 


f (x  ) 
v'  n' 


m- 1 


and  define 


M 

,(v)/  s 

m=l 


"m  • 
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Since^  assuming  Trlf^l  < oo^  r^(M)  -♦  E(Y^(v))/E(  = Tf^  = r,  each  r^(M) 

is  a strongly  consistent  estimator  for  r so  that  we  could  use  any  one 
of  them  to  estimate  r.  Actually  we  can  do  significantly  better  than 
that  by  using  r^fM),  simultaneously  to  estimate  r.  If  we 

let  {p(v):v  = 0,  k)  be  any  constants  so  that 


(3.1) 


and  then  set 


Z P(V)  = 1 


(3.2) 


rB(M)  = E 0(v)  ^ (M) 
^ v=0 


we  have  ^^(M)  ^ a.s.  as  M -»  cc.  The  values  of  p(v)  are  then  chosen 

/N 

to  minimize  the  asymptotic  variance  of  rp(M) . Details  of  the  choice  of 

the  P(v) 's  will  be  presented  later. 

We  now  turn  to  the  selection  of  the  functions  f^.  In  this  paper 

we  will  concentrate  on  only  one  (actually  the  simplest)  way  to  choose  the 

f^'s.  In  a subsequent  paper  we  will  study  alternate  methods  of  choosing 

the  in  the  case  when  the  state  space  is  finite. 

As  our  current  choice  of  f we  let 

V 


(3.3) 


f = P f , 

V * 


for  V = 0,  . . .,  k. 


Assuming  that  the  following  steps  can  be  justified  (which  they  will  be) 
we  have 

V-1 

= wP(  P f) 


'’-1 

= f(P  f)  (since  ttP  = tt) 


So  all  the  are  equal.  Noting  that  f^  = P® f = f we  have  = Trf  = 

and  so  = r for  all  v.  The  following  proposition  formalizes  this 
idea  by  showing  that  irf^  = Tr(Pf)  = r.  The  fact  that  irf^  = r for  all 
V > 0 then  follows  easily  by  induction. 

(3.4)  PROPOSITION.  If  Trlfl  <00^  then  7rf  = ir(Pf). 

PROOF.  Recall  that  irf  = f 7r(dy)  f(y).  Assume  for  the  moment  that 
f > 0.  We  first  prove  the  proposition  for  functions  f which  are 
indicators;  i.e.^  f(y)  = ^-jgjCy)  3°”'®  B € £, 
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fl(x)  = ; P(x,  dy)  f(y)  = / P(X,  dy)  Ijg^(y) 


= / P(x,  dy)  = P(x,B) 
B 


We  therefore  aave 

Tf^  = / 7r(dx)  P(x,B) 

= 7r(B)  by  (2.12.C) 

= f Tr(dy) 

B 

= / 7r(dy)  = -n-f  • 


Since  the  proposition  is  true  for  indicator  functions  it  is  true  for 
simple  functions;  i.e.^  functions  which  are  finite  linear  combinations 
of  indicators.  Now  let  f^"^^  be  a sequence  of  simple  functions 
increasing  to  f.  By  the  monotone  convergence  theorem 


(5.5) 


TTf 


lim 


If  we  let  = Pf^''^,  then 


Pf(x)  = / P(x,  dy)  f(y) 

= lim  / P(x,  dy)  f^'^^(y)  , 

n -» oo 

(5.6)  Pf(x)  = lim  4"^(^)  , 

n — » 00 
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again  by  the  monotone  convergence  theorem.  Furthermore  for  each  x the 
convergence  in  (5-6)  is  monotone.  Now  since  the  proposition  is  true  for 
simple  functions,  we  have 


(5.7)  TTf^''^  = . 

The  left  hand  side  of  (5.?)  converges  to  irf  by  (5-5)  so  that 

TTf  = lim  irf^''^ 
n -» 00 

= lim  / Tr(dx)  f^'^^x) 
n 00 

= / fr(dx)  ( lim  i:^”^(x)) 
n 00  ^ 

= f Tr(dx)  Pf(x)  by  (5.6) 

= Tr(Pf) 

The  last  interchange  of  limit  and  integral  is  once  again  justified  by 
the  monotone  convergence  theorem.  Finally,  since  ttI  f | < oo,  the  result 
is  true  for  a general  f by  writing  f(x)  = f^(x)  - f (x)  where 
f''’(x)  = max(0,  f(x))  and  f"(x)  = - min(0,  f(x))  and  applying  the 
result  to  both  f^  and  f . □ 

We  now  return  to  the  minimization  of  the  asymptotic  variance  of 
the  estimator  *^p(M)  defined  in  equation  (5.2).  By  Proposition  (2.21) 
we  have  the  following  central  limit  theorem; 
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=S>N(0,1) 


as  M ->  CO 


TM  ( V (3(v)  z (H)  - 3(v)r)) 
v=0  


where  is  defined  in  (2.22).  If  the  3(v)  's  sum  to  one  (equation 

(3-i))  we  can  rewrite  this  as 


(5.8) 


VM  (^^(M)  - r) 
\(g)/E(^) 


so  that  we  can  form  confidence  intervals  for  r based  on  the  estimator 

r (M).  Since  we  are  free  to  pick  g in  any  way  we  please  (subject  to  (3.1)), 
P 

2 

we  select  £ = where  g*  minimizes  the  variance  term  • This 

will  produce  the  smallest  possible  confidence  interval  for  r.  Note  that 
there  is  no  reason  to  restrict  g to  be  nonnegative;  i.e.,  ’^^(^)  need 
not  be  a convex  combination  of  the  minimize  the  variance 

we  must  solve  the  nonlinear  programming  problem; 

minimize 

k 

subject  to  D 0(v)  = 1 

V=0 

We  can  write  this  in  matrix  notation  by  letting  e denote  the  (ki-l) 
dimensional  row  vector  each  of  whose  components  is  1.  Our  problem  then 
becomes 
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(5.9) 


minimize 


subject  to  eg'  = 1 

It  is  straightforward  (using  Lagrange  multipliers)  to  show  that 

(5.10)  g*  = £ 2’Ve  e'  , 

(3.11)  \(g*)  = l/£  e’  • 

This  selection  of  multipliers  is  similar  to  those  given  on  page  I9  of 
Hammersley  and  Handscomb  ( I96L) , Lavenberg  (197^^  ^*'*1  Lavenberg  and 
Shedler  (1975).  We  have  dealt  here  with  forming  short  confidence  intervals 
based  on  a run  of  length  M cycles.  The  same  technique  applies  to  a run 
of  N transitions.  In  this  case  we  must  still  solve  the  nonlinear  pro- 
gram given  in  (3.9)  because  the  variance  terms  in  the  central  limit 
theorems  (2.21)  differ  only  by  a constant  multiple.  Equations  (3. 10)  and 

(3.11)  are  therefore  still  valid  in  this  case.  If  we  let 

(5.12)  X (N)  = E 0(v)  x^(N) 

we  then  have  the  two  central  limit  theorems; 


2L 


r 


(5.15) 


VM  (r  (M)  - r) 


(5.1^) 


VN  (Xg^(N)  - r) 


=>N(0,1)  . 


Since  the  covariance  matrix  is  in  general  unknown  it  becomes 
necessary  to  estimate  If  is  any  estimator  such  that 

Ej^(M)  -♦  a.s.  as  M -♦  «<>  then  -» E*  Letting 

(5.15)  |*(M)  = e E;\M)/e  e;\m)  e'  , 

/S 

it  is  clear  that  g*(M)  a.s.  as  M -♦ ».  Applying  this  result  with 

Proposition  (2.25)  we  can  maintain  the  asymptotic  normality  even  when  g* 
must  be  estimated.  Letting 


(5.16) 


(M)  = z P*(V,M)  ? (M)  , 

V=0 


we  then  have  the  central  limit  theorem^ 


(5.17) 


VM  [r^  (M)  - r) 

P* 

a(g*(M))/’T^(M) 


N(0,1) 


f 


where  a(g*(M))  is  defined  in  (2.25)  ^nd  Tj(M)  is  any  sequence  of 
numbers  such  that  ^^^(M)  ->E(Tj^)  a.s.  A corresponding  central  limit 
theorem  exists  for  the  x^(N) 's.  Because  this  method  combines  several 


% 
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different  estimates  of  the  same  quantity  we  call  it  the  "method  of  multiple 
estimates". 

In  order  to  apply  this  method^  the  functions  f^  must  be  computed 

(usually  before  the  start  of  the  simulation).  For  computational  efficiency 

f^  can  be  defined  recursively  by  f^  = f and  f^  = Pf^  ^ for  V > 1. 

This  avoids  having  to  compute  the  V-step  transition  function  a 

potentially  great  computational  savings.  If  the  state  space  is  finite 

and  the  transition  matrix  is  sparse  the  work  involved  in  calculating  f^ 

for  a few  values  of  V may  not  be  too  great.  If  the  computed 

before  the  start  of  the  simulation  we  must  be  able  to  store  them;  this  may 

be  a considerable  problem  if  the  state  space  is  very  large. 

We  note  that  to  form  the  estimates  (or  r^(M))  we  must 

evaluate  f (X  ) for  each  value  of  v and  each  transition  n.  This 
V'  n"^ 

will  tend  to  increase  the  amount  of  time  needed  for  each  transition 

simulated.  However^  if  the  variance  reduction  obtained  is  sufficiently 

large  the  potential  savings  in  the  number  of  transitions  that  need  to  be 

simulated  will  more  than  offset  the  extra  work  per  transition.  This  will 

be  discussed  in  greater  detail  in  Section  4.  We  also  note  that  additional 

work  must  be  done  at  the  end  of  each  cycle  to  update  the  estimates  of  the 

covariance  matrix  (using  no  variance  reducing  technique  we  need  only 

2 

update  ^q) • However  if  k is  small  and  if  cycles  tend  to  be  long  this 
should  not  have  a substantial  impact  on  the  CPU  time. 

It  should  be  mentioned  that  if  one  has  a choice  of  more  than  one 
return  state  the  variance  reductions  that  are  obtained  using  this  method 
are  (in  theory)  independent  of  the  return  state.  The  reader  should  consult 
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either  Chung  ( I967)  or  Crane  and  Iglehart  ( I975)  for  this  point.  For 

practical  reasons  it  is  recommended  that^  if  possible^  the  return  state 

be  chosen  so  that  cycles  are  not  excessively  long. 

We  now  examine  the  selection  of  the  parameter  k.  As  the  value  of 
2 

k increases  decreases.  This  is  because  the  kth  minimization 

problem  is  the  same  as  the  (k+l)st  problem  which  has  the  additional  con- 
straint that  p(k+l)  = 0.  This  means  that  as  we  do  more  computation  we 
can  get  increasingly  accurate  estimates  of  r. 

If  the  state  space  is  finite  we  have 

f (i)  = E f(j)  -»  Z TT  f(j)  = r as  k , 

j€E  j€E  J 

so  that  for  large  values  of  k^  f^^  will  be  a smoother  function  than  f. 

The  point  estimate  is  then  the  average  of  N+l  terms  each  of 

which  is  close  to  r so  that^  for  large  ^ smaller 

variance  than  Xq(N)  , In  fact  it  can  be  shown  that  -»  0 as  k ^00, 

/S 

Sy  placing  all  weight  on,  x^(N);  i.e.,  by  setting  P(k)  = 1 and  g(v)  = 0 

2 2 2 

for  V ^ k,  we  see  that  Furthermore, 

since  in  the  case  of  a finite  state  space  converges  exponentially 

to  r;  i.e.,  there  exist  constants  c,  > 0,  1 > Cg  > 0 such  that 
Ilk  2 

I fj^(  i)  - r|  < Cj^Cg,  the  rate  at  which  converges  to  0 is  also 

exponential  ( see  Doob  ( 1955) ) • 

For  many  types  of  Markov  chains  we  can  expect  substantial  variance 
reductions  even  when  k is  relatively  small  (say  2 or  3) . For  countable 
E we  have 
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r 


(5.18) 


V‘) 


Thus  if  the  Markov  chain  makes  transitions  only  to  "neighboring"  states 

and  if  f(j)  is  close  to  f( i)  for  j "near"  to  i it  can  be  seen 

from  equation  (5.I8)  that  for  small  k^  should  be 

nearly  the  same.  This  means  that  and  Xq(N)  will  be  highly 

correlated^  a condition  that  generally  results  in  good  variance  reduction. 

Many  queueing  networks  exhibit  this  special  type  of  structure. 

Ideally  one  would  like  to  be  able  to  select  the  "optimal"  value 

of  k in  the  sense  that  for  a given  computer  budget  we  would  like  to 

pick  the  value  of  k which  gives  us  the  smallest  confidence  intervals 

for  r ( part  of  the  budget  must  be  allocated  to  the  calculation  of  the 

f^'s).  To  perform  such  an  optimization  one  would  have  to  know  (or  have 
2 

estimates  for)  for  each  V > 0.  These  quantities  are  in  general 

unknown  and  even  to  estimate  them  would  require  calculating  the  fy'® 

and  then  simulating  the  Markov  process  for  a number  of  cycles.  The 

disadvantage  of  such  a procedure  is  that  one  may  invest  considerable  time 

in  the  computation  of  f^  only  to  find  that  the  calculation  was  wasted; 

i.e.^  the  variance  reduction  obtained  by  using  f^  is  not  sufficiently 

great  to  justify  the  added  cost  of  the  calculations. 

One  possible  approach  to  this  problem  is  a sequential  one.  Suppose 

one  has  calculated  f.,  f, f,  and  has  estimates  for 

0^  1^  ' k 

2 2 2 

Oq,  •••,  These  estimates  could  then  be  plotted  and  an 

2 

estimate  for  could  be  obtained  by  extrapolation.  It  could 

then  be  decided  (based  on  these  estimates)  whether  or  not  f,  , should 
' ' k+1 
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1 


be  calculated.  The  success  of  such  a procedure  depends^  of  course^  on 

2 

how  accurate  our  estimates  for  {o^(0*):O  < V < kj  are  as  well  as  the 
validity  of  the  extrapolation. 


We  feel  that  the  inability  to  predict  the  variance  reductions  in 
advance  is  the  major  drawback  of  the  method  ( this  is  also  a drawback  in 
other  more  standard  control  variable  methods).  The  simulator  must  be 
very  careful  to  apply  the  method  only  in  those  cases  where  it  is  efficient 
to  do  so.  Generally  speaking  the  success  of  this  technique  depends  on 
one's  ability  to  efficiently  compute  and  store  the  functions  f^. 

The  method  of  multiple  estimates  can  be  extended  to  certain  types 
of  continuous  time  processes  such  as  continuous  time  Markov  chains  and 
semi-Markov  processes.  The  basic  idea  here  is  to  simulate  the  imbedded 
discrete  time  Markov  chain  of  the  process  and  hold  the  process  in  a state 
for  a deterministic  amount  of  tim.e  equal  to  the  expected  holding  time 
of  the  state.  We  thus  transform  the  continuous  time  process  into  a 
discrete  time  process.  Once  the  appropriate  function  f for  the  discrete 
time  process  is  formed  the  method  proceeds  exactly  as  before.  The  reader 
is  referred  to  Hordijk,  Iglehart  and  Schassberger  ( I976)  for  a more  detailed 
account  of  this  transformation.  Several  of  the  examples  in  Section  k 
were  treated  in  this  manner. 

Since  it  is  often  inconvenient  to  actually  simulate  the  imbedded 

discrete  time  Markov  chain  we  present  an  analysis  that  shows  how  the 

method  can  be  applied  to  continuous  time  Marlcov  chains  which  are  simulated 

in  continuous  time.  Let  t > 0}  be  a continuous  time  Markov  chain 

with  transition  matrix  P..(t)  = P(X..  = jix  = i).  Let 

ij'  ' t 0 
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Q = € E}^  and  (assume  0 < < oo)  . We  also  assume 

that  the  process  is  positive  recurrent  and  irreducible.  Let  tt  = (ir^^ri  € E) 
be  the  stationary  distribution  of  the  process.  It  is  then  known  (see 
Cinlar  (1975))  that 


(3.20) 


ttQ  = 0 . 


Let  f:E  r = irf,  and  assume  that  fUI  < » as  before.  We  again 

are  interested  in  finding  functions  f^  so  that 


r 


Trfv(  = 


1 T 

Urn  - / f (X  )ds) 
T -*00  ^ 0 ^ 


for  each  v. 

Writing  (3.20)  out  we  see  that 


0 = 


Z TT  q 
ieE  ^ 


= Z ir.  q.  . - 


TT.  q. 
J J 


which  reduces  to 


TT. 


i = Z TT  q.7q 

J i IJ  J 


(3.21) 


Define  the  matrix  A = (a^.;i,i  £ EJ  by 

I ij 


(5.22) 


if  i = j , 


q.  ./q.  if  i / j . 

ly  j 


Equation  (5.21)  then  becomes  (in  matrix  notation) 


(5.25) 


ir  = ttA  . 


We  therefore  have  Trf  = TrAf  so  that  if  we  let  f^  = Af  we  have 
Trf  = Tfy  By  induction  if  we  let 


(;.2M 


for  V > 0 


then  vf^  = irf  for  all  v > 0 (define  A = the  identity^  so  that 
= f as  before) . 

Let  us  assume  (without  loss  of  generality)  that  we  now  pick  0 

as  our  return  state.  For  m > 1 let  p be  the  length  of  the  mth  visit 

— '^m 

to  state  0.  Define  the  return  times  T by 

m 


(5.25)  DEFINITION.  Tq  = 0, 


T = inf(s  > T + p :X  = 0)  , 
m m- 1 in  s ^ 


for  m > 1 
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Let  be  defined  by 

T 

m 

(5.26)  Yjv)  = / f^(X^)ds  . 

^m-1 

Since  we  still  have  r = E( Y^( v) )/E( t^) ^ we  can  now  proceed  exactly  as  in 

the  discrete  time  case.  The  important  thing  to  notice  is  that  we  only 

need  the  infinitesimal  matrix  Q to  form  the  functions  f . The  actual 

V 

simulation  of  the  process  can  take  place  in  the  most  straightforward  manner. 
If  we  convert  the  process  into  discrete  time  using  the  methods  of 
Hordijk  et  al.  ( I976)  and  then  apply  the  method  of  multiple  estimates  we 
generate  a sequence  of  functions  which  will  be  denoted  by  f^.  It  can 
then  be  shown  that 

(5.27)  f^(i)  = f^(i)/q. 

where  ^y( defined  in  (3.2k).  Recall  that  is  the  mean  of 

the  exponentially  distributed  holding  time  of  state  i. 

To  apply  the  method  to  semi-Markov  processes  simulated  in  continuous 
time  we  cannot  use  (3.2k)  to  calculate  our  functions  f^.  Instead  we  form 
the  functions  fj  by  using  the  discrete  time  methods  of  Hordijk, et  al. 

( 1976)  and  then  if  q^^  is  the  mean  holding  time  of  state  i we  let 


(5.28)  yi)  = ^(i)  q^  . 


1 

It  can  then  be  shown  that  again  Tf^  = irf  for  all  v > 0^  where  tt  is 
now  the  stationary  distribution  of  the  semi-Markov  process.  We  then 

define  and  (3*25)  ^nd  (3.26)  and  proceed  as  before.  j 

As  a further  extension  of  the  method  we  can  drop  both  the  Markovian 
and  regenerative  assumptions  on  need  to  add  the 

hypotheses  that  the  process  is  stationary  and  satisfies  an  asymptotic 

independence  condition  known  as  cp-  mixing  (see  Billingsley  (I968)).  Let  i 

i 

(X^}  be  a strictly  stationary  process  with  state  space  E.  For  integers  ' 

a < b let  ^ be  the  sigma  field  generated  by  X^^  ...,  Xj^  (with  the 
obvious  extensions  for  a = -co  and  b = +«>) . 

(5.29)  DEFINITION.  We  say  that  {X^}  is  cp-mixing  if  there  exists  a 
sequence  of  numbers  (cp(n)}  such  that  for  each  k (-00  < k < 00)  and 

n > 1,  e ^ 

Ip(E^  n Eg)  - P(E^)  P(E2)|  <cp(n)  P(E^)  . 

Assuming  that  P(E^)  >0  we  can  divide  both  sides  of  the  inequality  in 

(3.29)  by  P(E2^)  to  obtain  the  equivalent  condition 

(3.30)  |P(E2|E^)  - P(E2)  I < cp(n)  . 

If  q)(n)  -»0  as  n -> « (3.30)  says  that  events  in  the  distant  future 

are  nearly  independent  of  the  past.  Let  f:E  -»]R  and  let  r = E(f(X^)). 

We  are  now  interested  in  estimating  r.  Define 
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for  some  fixed  finite  integer  a.  It  can  then  be  shown  that  the  process 
lYn(v))  is  also  stationary  and  cp-mixing  (with  a different  qj) . Further- 
more if  E(|f(Xj|)  < 00^  then 

(3.32)  = E(YJv))  = E(E(£{X__^„)|^_^)) 

= E(£(X  )) 

' ^ n+v 

= r (by  stationarity)  . 

We  can  again  try  to  apply  the  method  of  multiple  estimates  to  estimate  r. 
To  do  so  let 

(3.35)  V''> 

n=0 

(3.5M  yv)  = Y^(v)  - r . 

We  can  row  use  the  multidimensional  central  limit  theorem  for  q;-mixing 
processes  given  in  Billingsley  ( I968) . 


1/0 

(5.35)  PROPOSITION.  If  Z <p(n)  ' < cx>  and  E(  1 f(X  )|)  < »,  then 

n=0  " 

($q(N)  - r),  ....  (^N)  - r))  =^.N(0,  Vj^) 


where  Vj^  is  assumed  to  be  a symmetric  (k+1)  by  (k+l)  dimensional 
positive  definite  matrix  with  finite  entries  defined  by 


(3.36)  V = E(ZQ(i)  Z (J))  + y E(ZQ(i)  Z (j))  + Z E(Z^(i)  ZqCJ)) 

n=  1 n=  1 


We  can  now  proceed  as  before.  The  terms  are  usually  unknown  and 

must  be  estimated^  frequently  a formidable  task.  In  addition  the  calculation 

of  be  very  difficult.  It  will  depend  on  the  structure  of  the 

underlying  stochastic  process  unlike  in  the  Markov  case^  no 

neat  computational  formulas  ( like  f^  = p'^f)  can  be  given  for 

E(f(X  )(x  X X ).  Because  of  these  difficulties  no  numerical 

' n+V'  n>  n-1’  ’ n-a' 

examples  of  this  type  were  tried^  although  further  research  in  this  direc- 
tion is  planned. 
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i*-.  Examples 

To  find  out  how  well  the  method  performs^  four  test  problems  were 
selected.  The  four  problems  all  come  from  the  area  of  queueing  theory. 

They  are  the  queue  length  process  in  a finite  capacity  m/m/ 1 queue^  the 
queue  length  process  in  the  repairman  problem  w^-th  spares^  and  the  waiting 
time  processes  in  both  the  1 and  queues.  These  processes  were 

chosen  because  analytic  results  are  readily  available^  thereby  making  a 
comparison  between  analytic  and  simulation  results  possible.  Despite 
their  simplicity  they  are  by  no  means  "easy"  processes  to  simulate^ 
particularly  the  heavily  loaded  queues  which  require  very  long  run  lengths 
(see  Table  1)  to  get  good  simulation  estimates.  For  all  four  types  of 
processes  substantial  variance  reductions  have  been  realized  (although 
in  the  m/m/2  queue  with  p = .9  the  reduction  in  variance  is  not  enough 
to  justify  the  use  of  the  method) . ^Jhile  it  is  difficult  to  predict  the 
variance  reductions  that  will  be  obtained  for  a particular  stochastic  pro- 
cess^ it  is  felt  that  this  method  shows  a great  deal  of  promise  and  deserves 
serious  consideration  when  planning  a simulation  experiment.  The  remainder 
of  this  section  will  be  devoted  to  a more  detailed  description  of  the 
examples  and  a presentation  of  their  numerical  results. 


I 

I 
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(4.1)  EXAMPLE.  Finite  Capacity  M/'i/l  Queue 

Let  {X( t) ^ t > 0}  be  a birth  and  death  process  with  parameters 


X. 

1 


i = 0,  1, 


M-1 


i > M 


i 


1 


2 


) 


M 


where  M is  a finite  positive  integer  and  0 < pi  < oo.  The  process 

{X(t)^  t > 0}  is  then  the  queue  length  process  for  an  M/M/1  queue  with 

finite  queue  capacity  M.  In  this  system  customers  arrive  according  to 

a Poisson  process  with  rate  however  any  customer  arriving  when  there 

are  M customers  already  in  the  system  is  denied  entrance  to  the  queue 

and  departs  never  to  return.  The  i.i.d.  service  times  have  an  exponential 

distribution  with  mean  pi  The  state  space  is  E = {0^  1,  ...^  Mj.  We 

investigate  the  variance  reductions  for  three  different  functions  f; 

2 

f( i)  = i^  f( i)  = i and  f( x)  = l^Qj(i).  The  corresponding  r's  for 

2 

these  functions  are  E(X  ),  and  P{X  = 0}  respectively.  By  applying 

the  discrete  time  methods  of  Hordijk,et  al.,  ( I976)  we  can  solve  systems 

* 2 

of  linear  equations  to  find  r^  g and  M has  been  chosen 

to  be  14  so  that  these  systems  of  equations  may  be  easily  solved  for  a 
wide  variety  of  parameter  values.  Let  p = Vu-  Because  the  state  space 
is  finite  the  Markov  chain  is  positive  recurrent  for  all  values  of  p 
(unlike  the  infinite  capacity  queue  which  requires  p < 1)  . For  small 


57 


values  of  p the  capacity  constraint  plays  little  role  so  that  behavior 

of  the  finite  capacity  and  the  infinite  capacity  queues  should  be  nearly 

the  same.  To  help  the  reader  better  assess  the  effect  of  the  capacity 

limitation  we  note  that  for  the  infinite  capacity  M/M/1  queue  E(X)  s p/(l--p), 

E(X^)  = p(Up)/(l-p)^  and  P(X  = 0)  = 1-p. 

For  the  numerical  results  that  follow  we  have  slightly  modified  the 
2 

definition  of  to  make  it  the  asymptotic  variance  term  in  the 

central  limit  theorem  for  x (M),  It  therefore  takes  into  account  all 

l/p  2 2 2 

constant  multiples  such  as  E(Tj^)  . Let  Rj^  = To  obtain 

/\ 

confidence  intervals  of  equal  length  if  we  use  the  estimator  we 

* rt'W' 

2 . 

need  only  simulate  times  as  many  transitions  as  would  be  needed  using 

no  variance  reduction  technique  ( Chat  is  if  we  used  just  the  regular  point 

/s 

estimate  x^) , For  a fixed  (large)  number  of  simulated  transitions^ 

/\ 

the  length  of  the  confidence  interval  for  r using  Xp^(N)  divided  by 

/\ 

the  length  of  the  confidence  interval  for  r using  Xq(N)  is 
2 

R = . R,  and  R are  the  usual  efficiency  measures  of  a 

variance  reduction  technique, 

2 2 

Tables  2^  5 and  4 list  p^  r^  Rj^,  and  R^^  for  k = 1^  2,  5. 

2 

For  each  k R,  is  listed  directly  below  R As  an  example^  in  Table  2, 

iC  iC 

for  p = .5  we  see  that  to  obtain  confidence  intervals  for  E(X)  = .9995 

of  equal  length  we  need  only  simulate  5*24^  as  many  transitions  using 

x_j,  (with  k = 3)  than  using  just  x , For  the  samenumber  of  transitions 

u 

simulated  the  ratio  of  the  lengths  of  confidence  Intervals  is  .2288.  Notice 
that  the  choice  of  the  function  f influences  Che  variance  reductions  as 
does  the  value  of  p.  Generally  speaking  as  the  traffic  intensity^  p^ 
Increases  the  variance  reduction  obtained  decreases. 
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TABLE  2 


Calculaced  Variance  Reductions  for  Finite 
Capacity  Queue;  r = E(X) 


p 

E(X) 

2 

<^0 

"1 

^1 

"2 

^2 

“5 

'5 

.10 

.1111 

.0244 

.0454 

.2156 

.0045 

.0674 

.0005 

.0213 

.20 

.2500 

.2312 

.0926 

.3045 

.0185 

.1361 

.0037 

.0609 

.30 

.1^286 

1.469 

.1415 

.3759 

.0424 

.2053 

.0127 

.1126 

.IfO 

.6667 

5.888 

.1905 

.4364 

.0756 

.271^9 

.0297 

.1725 

.50 

.9995 

21.59 

.2541 

.4858 

. 1121 
.33^7 

.0524 

.2263 

.60 

1.1^93 

76.57 

.2601 

.5100 

.1352 

.3677 

.0631 

.2610 

.70 

2.262 

250.9 

.2884 

.5370 

.1395 

.373^ 

.0685 

.2615 

.80 

3.iv53 

670.2 

.4094 

.6399 

. 1623 
.4023 

.0692 

.2631 

.90 

5.111 

1262 

.6050 

.7778 

.2659 

.5156 

.1143 

.3338 

.93 

6.052 

1476 

.6880 

.8294 

.3i^25 

.5852 

.1607 

.4009 

.99 

6.813 

15^+8 

.7404 

.8605 

.4056 

.6369 

.2047 

.i^525 
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TABLE  5 


Calculated  Variance  Reductions  for  Finite 

2 

Capacity  n/n/ I Queue;  r = E(X  ) 


p 

E(X^) 

CVl  0 
b 

"2 

.10 

.1558 

.1241 

.0271 

.1647 

.0044 

.0664 

.0004 

.0210 

.20 

.5750 

2.508 

.0442 

.2105 

.0150 

.1159 

.0026 

.0509 

.50 

.7959 

22.64 

.0564 

.2576 

.0222 

.1488 

.0065 

.0808 

.40 

1.555 

156.7 

.0627 

.2505 

.0285 

.1688 

.0114 

. 1066 

.50 

2.992 

974 

.0628 

.2507 

.0520 

.1790 

.0145 

.1194 

.60 

5.875 

5477 

.1055 

.5245 

.0516 

.1778 

.0120 

.1094 

.70 

11.82 

25,787 

.2681 

.5178 

.0775 

.2784 

.0276 

.1662 

.80 

25.42 

90,654 

.4857 

.6969 

.1969 

.4458 

.0905 

.5001 

.90 

42.66 

211,556 

.6512 

.8070 

.5455 

.5859 

. 1870 
.4525 

.95 

54.75 

270,711 

.7078 

.8415 

.4089 

.6595 

.2568 

.4865 

.99 

65.06 

505,811 

.7425 

.8617 

.4545 

.6742 

.2757 

.5252 

UO 

ik 


TABLE  L 


Calculated  Variance  Reductions  for  Finite 
Capacity  M/M/ 1 Queue;  r = P(X  = O) 


p 

P(X  = 0) 

2 

<^0 

r| 

^1 

^2 

.10 

.9000 

. .0075 

. 1000 
.3162 

.0100 

.1000 

.0010 

.0316 

.20 

.8000 

.0083 

.2000 

.4472 

.0400 

.2000 

.0080 

.0894 

.30 

.7000 

.0132 

.5000 

.5^+77 

.0900 

.3000 

.02J0 

.1643 

.LO 

.6000 

.3657 

.4000 

.6324 

.1599 

.3999 

.0639 

.2528 

OO 

.5000 

.6659 

.^995 

.7067 

.2492 

.4992 

.1242 

.3524 

.60 

.4002 

1.069 

.5961 

.7721 

.35i^l 

.3951 

.2093 

.4575 

.70 

.3014 

1.52 

.6830 

.8265 

.4627 

.6802 

.3101 

.5568 

.80 

.2075 

1.811 

.8657 

.5543 

.7445 

.4033 

.6350 

.90 

.1259 

1.619 

.7884 

.8879 

.6113 

.7818 

. 4645 
.6816 

.95 

.0932 

1.338 

.7975 

.8930 

.6249 

.790^ 

.4796 

.6925 

.99 

.0715 

1.076 

.8003 

.8946 

.6292 

.7932 

.4843 

.6959 

LI 
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(1*..2)  EXAMPLE.  The  Repairman  Problem  with  Spares. 

We  next  consider  a repairman  problem  with  n operating  units  and 
m spares.  Each  of  the  operating  units  fails  with  rate  A.  Upon  failure 
the  unit  enters  a queue  to  obtain  service  from  one  of  s repairmen.  The 
failed  unit  is  replaced  by  a spare  (if  any  are  available).  The  distribu- 
tion of  the  i.i.d.  repair  times  is  exponential  with  mean  for  each 

repairman.  When  a unit  is  repaired  it  enters  the  pool  of  spares  unless 
there  are  fewer  than  n units  in  operation  in  which  case  the  repaired 
unit  immediately  becomes  operational.  If  we  let  X(  t)  denote  the  number 
of  units  in  service  and  in  the  queue  for  service  then  {X( t) , t > 0)  is 
a birth  and  death  process  with  parameters; 

nA,  0 < i < m 

(n+m-i)A,  m < i < n+m 

i|a,  ^ ^ i .5  ® 

S|^^  s < i < m+n. 

The  process  can  be  modelled  by  the  simple  closed  queueing  network  pictured 
in  Figure  1. 

2 

For  our  functions  f we  choose  f( i)  = i and  f( i)  = i so 
2 

that  r = E(X)  and  E(X  ) respectively.  In  our  example  we  let  m = k, 
n •-  10  and  A = 1.  The  parameters  s and  n are  then  allowed  to  vary. 
Again  the  state  space  is  E = (0,  1^  ...^ 


k2 


Ik).  It  should  be  emphasized  that 


FIGURE  1 


m spares  n operating  queue  s 

units  repairmen 


the  figures  in  Tables  5 ^ ^re  actual  calculations  of  variance  reductions 

and  not  estimates  derived  from  simulations.  The  tables  are  grouped  accord- 
ing to  the  value  s^^  which  is  the  maximum  service  rate  for  the  repair 
facility. 

Notice  that  for  both  functions  and  all  combinations  of  parameter 
values  substantial  variance  reductions  are  obtained.  The  variance  reduc- 
tions are  approximately  the  same  for  all  sets  of  parameters^  a situation 
not  found  in  the  truncated  M/M/ 1 queue.  Note^  however^  that  by  setting 
s = n = 1 and  m = M-1  we  obtain  the  same  birth  and  death  parameters  as 
in  the  M/m/ 1 queue  with  capacity  M.  The  appearance  of  uniformity  in 
variance  reductions  is  therefore  due  to  the  fact  that  in  the  range  of 
parameter  values  chosen  the  process  exhibits  a more  "stable"  type  of 
behavior  than  the  M/m/ 1 queue  does. 


TABLE  5 


Calculated  Variance  Reductions 
Repairman  Problem:  r = E(X) 

for 

(s,n) 

E(X) 

2 

'^0 

4 

^2 

"3 

1,12 

2.751 

149.6 

.0596 

.1989 

.0110 
. 1049 

.OC58 

.0764 

2,6 

3.078 

130. 1 

.0518 

.1783 

.0086 

.0928 

.0071 

.0843 

3.i^7l 

110.4 

.0575 

.1936 

.0106 

.1029 

.0105 

.1023 

^3 

5.890 

95.5 

.0428 

.2069 

.0171 

.1309 

.0171 

.1307 

1,9 

4.842 

267.9 

.0835 

.2885 

.0563 

.2575 

.0321 

.1791 

2,J+.5 

5.025 

228.0 

.0719 

.2681 

.0501 

.2239 

.0525 

.1797 

2,3 

5.255 

186.9 

.0640 

.2529 

.0454 

.2083 

.0282 
. 1678 

^2.25 

5.510 

150,6 

.0590 

.2428 

.0390 

.1975 

.0287 

.1694 

1,6 

7.950 

155.9 

. 1766 
.4205 

.0643 

.2557 

.0284 

. 1685 

2,3 

7.951 

148.0 

.1634 

.4043 

.0562 

.2371 

.0243 

.1557 

3,2 

7.984 

137.7 

.1475 

.5841 

.0470 

.2167 

.0195 

.1596 

i^,1.5 

8.050 

125.6 

. 1308 
.5617 

.0393 

.1982 

.0184 

.1356 

1,3 

10.999 

32.89 

.2171 

.4659 

.0682 

.2611 

.0223 

.1491+ 

2,1.5 

11.000 

32.83 

.2169 

.4658 

.0680 

.2608 

.0222 

.1489 

3,1 

11.000 

32.86 

.2166 

.4654 

.0677 

.2603 

.0220 

.1482 

^,.75 

11.000 

32.31 

.2160 

.4648 

.0672 

.2592 

.0216 
. 1468 
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TABLE  6 


Calculated  Variance 

Repairman  Problem: 

Reductions 
r = EfX^) 

for 

(s,^) 

E(X^) 

2 

'^O 

^2 

^2 

’'5 

1,12 

13.46 

9,610 

.0836 

.2892 

.0156 

.1251 

.0067 

.0819 

2,6 

15.06 

9,377 

.0674 

.2595 

.0172 

.1515 

.0083 

.0911 

17.28 

9,009 

.0552 

.2306 

.0177 

.1329 

.0104 

.1018 

20.01 

8,568 

.0445 

.2109 

.0217 

.1473 

.0108 

.1039 

1,9 

31.66 

28,546 

. 1659 
.4075 

.0351 
. 1874 

.0134 

.1157 

2,4.5 

32.85 

26,287 

.1420 

.3768 

.0290 

.1702 

.0117 

.1085 

3,5 

34.51 

25,787 

.1160 

.3406 

.0251 
. 1521 

.0108 
. 1041 

4,2.25 

56.59 

21,153 

.0928 

.5046 

.0199 
. 1411 

.0113 
. 1061 

1,6 

69.25 

33, 9^+^^ 

.1897 

.4356 

.0629 

.2509 

.0248 

.1575 

2,5 

69.43 

33,120 

.1804 

.4247 

.0565 

.2375 

.0203 

.1443 

3,2 

69.74 

31,904 

. 1676 
.4094 

.0481 

.2195 

.0165 

.1285 

4,1.5 

70.21 

30,281 

.1525 

.3902 

.0597 

.1992 

.0129 

.1134 

1,5 

123.99 

14,456 

.2046 

.4523 

.0594 

.2437 

.0178 

.1334 

2,1.5 

123.99 

14,453 

.2045 

.4522 

.0594 

.2437 

.0177 
. 1332 

5,1 

124.00 

14,449 

.2044 

.4521 

.0592 

.2434 

.0176 
. 1328 

4,.  75 

124.00 

14,437 

.2040 

.4517 

.0589 

.2427 

.0174 
. 1520 

A. 


ueue. 


(14..3)  EXAMPLE.  Waiting  Time  Process  in  an  M/>l/ 1 Q 

We  now  turn  to  an  example  of  a regenerative  Markov  process  with  an 

uncountable  state  space  E = [O.co),  We  let  W and  S be  the  waiting 

' n n 

time  and  service  time  respectively  of  the  nth  customer  in  a single  server 

queue.  Let  the  time  between  the  arrival  of  the  nth  and  (n+l)st 

customer.  We  assume  that  ^ 0)  i.i.d.  with  ^ ^ and 

that  n > 1)  are  i.i.d.  with  E(A^)  = Let  the  traffic  intensity 

p be  defined  by  p = We  assume  that  customer  number  0 arrives  at 

time  0 to  an  empty  system.  Let  X = S , - A for  n > 1.  The  waiting 

n n- 1 n — 

time  process  ^ ^ defined  recursively  by 


(4.1^) 


w 


0 


0 


w 

n 


1 


n > 1. 


It  is  known  that  if  p < 1 there  exists  an  infinite  number  of 

indices  n such  that  W = 0 (and  the  expected  time  between  any  two 

n 

consecutive  indices  is  finite) . Thus  we  choose  x^  = 0 as  our  return 

state  and  regenerations  occur  whenever  a customer  arrives  to  an  empty 

queue.  For  p < 1 we  therefore  know  that  there  exists  a random  variable 

W such  that  ^ W.  For  more  details  on  this  queue^  which  is  commonly 

called  the  GI/G/1  queue^  see  Iglehart  (I97I).  We  shall  be  interested  in 

2 

estimating  E(W)^  which  is  finite  if  The  appropriate  function 

y 

f is  then  f(x)  = x.  in  order  to  calculate  f^  = P f we  need  to  find 
the  transition  function  of  the  process.  We  illustrate  this  for  the  M/M/1 
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queue;  i.e.^  for  a queue  with  exponentially  distributed  service  times  with 
mean  4'^  and  Poisson  arrivals  v/ith  rate  A.  For  M/1V^  1 the  calculations 
are  straightforward.  The  approach  generalizes  easily  to  the  GI/G/I  queue^ 
although  one's  ability  to  carry  out  the  computations  in  practice  depends 
on  the  distributions  of  the  service  and  interarrival  times.  Numerical 
integration  techniques  may  be  of  use  here  although^  in  theory^  one  must 
calculate  P f(x)  for  all  values  of  x > 0. 

For  the  >5/1^1  queue  it  is  easy  to  show  that 


u Ay 
A-r4  ® 


for  y < 0 


(i^.5) 


P(x^  < y)  = 


1 - e‘^^  for  y > 0 . 

A+4  — 


Thus  g(y)  = 5 y)  exists  for  all  y and  we  write  F(X^  £ dy} 

= g(y)‘^y  where 


(4.6) 


^ g_  ( y) 

g(y)  =1 

isjy) 


2l±. 

A+4 


A+4 


y < 0 


y > 0 . 


Now  to  evaluate  fj^(x)  we  have 


^7 


f^(x)  = ; P(x,  dy)  f(y)  = ; yp{(w^  + X £ dy|w^  = x) 

LO^co)  [0,“) 

= / yP(x  + X £ dy}  = / yg(y-x)dy 

(0,00)  (0,co) 


= / yg  (y-x)dy  + f yg  (y-x)dy 
y=0  " y=x 


Evaluation  of  these  integrals  is  straightforward  and  we  find  that 


-Toe 


(1..7)  f^Cx)  = X e 


To  evaluate  ^2^^)  ®ust  compute  the  integral 


(+.3)  f2(x)  = / P(x,dy)  f (y) 

[0,00) 

= P(x,  (0})  f (0)  + / f (y)  g(y-x)dy 

(0,»)  ^ 

-x)  fj^(O)  + / fj^(y)  g(y-x)dy 
(O^oo) 


It  is  finally  found  that 


(i^.9) 


£,W  . £,(k) 


It  is  possible  in  this  case  to  calculate  exactly  the  covariance 
matrix  ^2  so  that  we  can  obtain  exact  results  for  the  variance  reductions. 
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The  calculacions  are  very  long  and  tedious  and  so  will  be  omitted.  The 

basic  approach  is  to  use  stationary  process  theory;  i.e.^  we  assume  that 

n > 0)  is  a stationary  process.  We  can  then  obtain  the  central 

limit  theorem  given  in  Proposition  (3.35).  The  covariance  matrix  V for 

this  central  limit  theorem  is  given  in  (3.56).  Since  the  central  limit 

theorem  given  in  Proposition  (2.21)  (with  its  covariance  matrix  ZV'E(Tj^)) 

is  still  valid  no  matter  what  the  initial  distribution  of  the  process 

is^  it  must  be  the  case  that  the  two  seemingly  different  matrices  are 

equal.  Therefore,  v..  = a../E(T.).  We  can  now  evaluate  the  terms  v.. 

^ ij  ij  1 ij 

by  performing  calculations  on 


(1^.10) 


R(z 


1^  ^2* 


s)  = L El 
n=0 


[-z.W  -E  W 

10  2 n n 

e J s , 


the  generating  function  of  the  joint  Laplace  transform  of  the  (stationary) 
waiting  time  process.  E(z^^  s)  has  been  calculated  by  Blomqvist 

(1967).  Table  7 gives  the  calculated  variance  reductions  in  the  II/h-'I 
queue  for  different  values  of  p.  Again  we  obtain  substantial  variance 
reductions  although  the  method  is  somewhat  less  effective  for  high  values 
of  p. 

To  determine  how  well  the  method  performs  in  actual  simulation 
two  values  of  p,  p = .5  and  p = .9^  were  selected  for  simulation.  The 
simulation  run  lengths  were  200^000  and  90,000  cycles  for  p = .5  and  .9 
respectively.  The  expected  number  of  customers  simulated  for  these  two 
runs  are  1+00,000  and  900,000.  The  random  number  generator  described  in 
Learmonth  and  Lewis  (1973)  was  used.  Statistical  properties  of  this 
generator  are  reported  in  Learmonth  and  Lewis  ( 197^) . 

1+9 


Each  run  was  broken  into  R independent  replications  of  C 
cycles/replication  for  several  different  values  of  R and  C (RC  = total 
number  of  cycles  simulated) . For  example  in  the  p = ,5  run  we  had 
R = 200,  100,  50,  snd  1 with  their  corresponding  values  of  C = 1000, 

2000,  kOOO,  and  200,000.  For  each  replication  we  formed  point  estimates 
for  the  various  parameters  of  interest.  The  figures  reported  in  Tables  8, 

9,  11,  and  12  are  then  the  sample  averages  of  the  point  estimates  taken 
over  the  R independent  replications.  Approximate  95'?^  confidence  intervals 
for  each  parameter  (formed  in  the  usual  manner  using  the  R i.i.d. 
replications)  are  given  directly  below  the  point  estimator.  For  example, 
in  Table  8,  for  R = 200  and  C = 1000  a 95^  confidence  interval  for 

/N 

r = E(W)  = 1000  based  on  the  estimator  is  (1.015  - .016^  I.OL5  + .OI3). 

In  these  tables  we  include  the  dependence  of  ^ on  k by  letting 
denote  the  vector  of  optimal  multipliers  when  combining  the  estimators 

^0^  ^k^ 


(4.11) 


/N  /N  /S 

= Z r 

P*  v=0 


where  ^|^(v)  is  our  point  estimate  for  p*(v). 

On  each  replication  90,  95  and  99-;(  confidence  intervals  were  formed 

✓S  /\  />  /N  /S 

using  each  of  the  point  estimators  r r^,  rp,  r^  and  r^  and  their 

appropriate  estimated  variance  terms.  The  fraction  of  those  confidence 
intervals  that  actually  contain  r are  given  in  Tables  10  and  I5.  If 
valid  confidence  intervals  are  being  formed,  these  fractions  ( ca’ led 
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The  relatively  low  coverages  for  shorter  runs  (small  C)  can  be  explained 

2 

by  observing  in  Tables  9 and  12  chat  is  underestimated.  This 

causes  the  confidence  intervals  to  be  too  short  thus  resulting  in  lowered 

2 

coverage.  The  main  reason  that  is  underestimated  is  due  to 

inaccurate  estimation  of  2,  (see  Tables  9 and  12), 

Because  of  this  behavior  the  simulator  must  take  care  that  the  run 
length  is  not  too  short.  By  having  too  short  a run  length  the  simulator 
may  be  placing  unjustified  confidence  in  his/her  estimates.  If  the  run 
length  is  adequate  however  the  method  is  capable  of  producing  tight  con- 
fidence intervals  with  good  coverage  properties.  The  techniques  developed 
by  Lavenberg  and  Sauer  ( 1977)  determining  run  lengths  are  applicable 
and  may  be  of  practical  value  in  this  area. 

In  order  to  use  this  method  we  must  evaluate  the  functions 
fQ(W^)  j ^2^'^n^  each  customer  n.  To  get  a measure  of 

the  computational  savings  ( if  any)  of  the  method  it  is  important  to  deter- 
mine how  much  extra  work  is  being  done  for  each  customer.  From  (l4.,10) 
and  (If.  12)  it  is  seen  that  to  evaluate  f^^  and  one  exponential  and 

several  multiplications  and  additions  must  be  computed  for  each  customer. 
For  p = .5  we  estimated  ( from  CPU  times  of  shorter  runs)  that  on  the 
average  each  customer  using  multiple  estimates  requires  5/3  ^s  much  CPU 
time  as  a run  using  no  variance  reduction  technique.  Since  we  need  only 
simulate  .0527  times  as  many  customers  to  get  confidence  intervals  of 
equal  length  (see  Table  7)  our  computational  savings^  which  is  defined 
as  the  ratio  of  CPU  times  needed  to  obtain  equally  accurate  estimates^  is 
.O5U5  (=  5/3  X .0527).  For  p = .9  the  work  per  customer  is  increased  by 
less  than  a factor  of  5/3*  This  is  because  for  a fixed  number  of 
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customers  we  must  evaluate  and  the  same  number  of  times  for  the 


different  values  of  p,  but  we  experience  fewer  cycles  with  the  higher 
value  of  p.  We  therefore  spend  less  time  updating  estimates  of  E.  Thus 
for  p = ,9  our  computational  savings  is  at  least  .5^7  (=  5/5  X .523). 

On  the  other  hand  if  we  fix  the  CPU  time  to  be  the  same  for  each  method 
what  is  the  statistical  savings  (defined  to  be  the  ratio  of  confidence 
interval  lengths  for  equal  run  times)?  Suppose  for  a specified  CPU  time 
we  can  simulate  customers  using  no  variance  reduction  technique 

(method  1)  and  customers  using  multiple  estimates  (method  2).  Since 

each  method  2 customer  requires  5/3  much  CPU  time  as  a method  1 customer 
we  must  have  = 5/5  The  ratio  of  the  lengths  of  the  confidence 


intervals  is  then 


mV2 


(5/3) 

^1/2  - a^O*) 


This  ratio  is  .25  and  .Jk  for  p = .5  ^nd  .9  respectively. 


1 
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TAELE  7 


Calculated  Variance  Reductions  for 
Process  in  an  M,  >!,  I Queue; 


the  Waiting  Time 
r = E(W) 


(u  = 1,  A = c,  E’W'  = c,  ..-X) 


p 

E(W) 

"0 

■^1 

*1 

'2 

. 10 

,1111 

. “ 

ulTO 
.0.  5 

.0001 

.0008 

.20 

.250 

1 . 

.0c;c5 

. 1623 

,0009 

.0306 

.50 

.li29 

5.'^t 

.0568 

.2383 

.0045 

.0672 

.LO 

.667 

10. 

.0968 

.3111 

.0137 

.1173 

.50 

1.00 

29.0 

.1457 

.5817 

.0527 

.1803 

.60 

1.50 

83.5 

.2029 

.4505 

. 0664 
.2577 

.70 

2.55 

556 

. 2678 
.5175 

. 1214 
.3^85 

.80 

4.00 

1,976 

.3397 

.5828 

.2055 

.4553 

.90 

9.00 

35,901 

.4175 

.6461 

.3280 

.57^7 

.95 

19.00 

607,601 

.4582 

.6769 

.4072 

.6381 

.99 

99.00 

5.96  X 10® 

.4904 

.7003 

. 4686 
.6845 
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TA3LE  3 


Simulation  Results  for  Waiting  Time  Process  in  M/M/ 1 Queue 
With  p = .5,  Point  Estimates  and  95^:  Confidence  Intervals 


R = 200 

R = 100 

II 

0 

R = 1 

True 

Parameter 

Value 

C = 1000 

C = 2000 

C = 4000 

C = 200,000 

/N 

1.000 

1.015 

1.016 

1.018 

1.018 

• 0 

.016 

.017 

.015 

/s 

"1 

1,000 

1.015 

1.014 

1.015 

1.015 

.015 

.014 

,012 

✓s 

1,000 

1.011 

1.011 

1.012 

1.015 

.011 

.011 

.010 

/s 

r 

1.000 

.999 

1.000 

1.005 

1.005 

.006 

.007 

.006 

/S 

r 

l.OCO 

.996 

.998 

.999 

1.001 

.005 

.005 

.005 

.U6 

. 106 

.116 

.125 

.155 

.005 

.007 

.007 

^1 

.582 

.521 

.337 

.352 

.367 

.003 

.010 

.011 

4 

.035 

.015 

.018 

,022 

.026 

.001 

.002 

.005 

.181 

.U5 

.129 

.144 

.160 

c. 

.005 

.007 

.008 

P|(0) 

-3.645 

-3.572 

-3.635 

-5.6^2 

-3.727 

.094 

. 106 

. 107 

Pt(l) 

4.645 

4.572 

4.655 

4.692 

4.727 

.094 

.106 

. 107 

^^(0) 

5.596 

4.587 

4.866 

5.125 

5.343 

.222 

.266 

.274 

3^(1) 

-16.700 

- 14 , 648 

-15.565 

-16.024 

- 16.585 

.575 

.636 

.710 

3^(2) 

12.504 

11.061 

11.493 

11.901 

12.24.2 

.551 

.U21 

.456 

TABLE  9 


Point  Estimates  and  95'i  Confidence  Intervals  for  Variances 
and  Covariances,  M/M/ 1 Simulation,  p = .5 


Parameter 

True 

Value 

R = 200 

C = 1000 

R = 100 

C = 2000 

R = 50 

C = 4000 

R = 1 

C = 200,000 

2 

'^0 

29.0 

30.23 

2.53 

30.56 

2.67 

30.83 

2.34 

30.98 

^01 

23.67 

24.67 

2.21 

24.95 

2.34 

25.18 

2.05 

25.31 

^02 

19.)+8 

20.30 

1.93 

20.53 

2.04 

20.73 

1.80 

20.83 

2 

°1 

20.30 

1.94 

20.53 

2.05 

20.73 

1.81 

20.84 

‘^12 

16. 14 

16.79 

1.69 

16.99 

1.79 

17. 16 
1.59 

17.25 

4 

13.44 

13.95 

1.48 

14.12 

1.56 

14.27 

1.39 

14.34 

4.23 

3.46 

.40 

3.74 

.46 

3.98 

.46 

4.13 

.948 

.505 

.079 

.603 
. 104 

.699 

.119 

.792 

56 


TABLE  lO 


Probability  of  Coverage  for  90,  95  and  99^0  Confide. '.ce 
Intervals  for  E(W)  in  M/M/1  Queue,  p = .5 


Estimator 

Type  of 
C.I. 

R = 200 

C = 1000 

R = 100 

C = 2000 

R = 50 

C = 4000 

R = 1 

C = 200,000 

/S 

.90 

.89 

.92 

.92 

0.00 

.95 

.95 

.96 

.98 

0.00 

.99 

.99 

.98 

1.00 

1.00 

.90 

.89 

.90 

.92 

0.00 

.95 

.95 

.9- 

.96 

0.00 

.99 

.93 

.98 

1.00 

1.00 

.90 

•89 

.89 

.94 

0.00 

"2 

.95 

.9^ 

.9'+ 

.96 

0.00 

.99 

.98 

.98 

.98 

1.00 

.90 

.81 

.84 

.92 

1.00 

r 

.95 

.87 

.88 

.94 

1.00 

.99 

.93 

.95 

.96 

1.00 

.90 

.69 

.78 

.73 

1.00 

.95 

.79 

.90 

1.00 

.99 

.81; 

.86 

.93 

1.00 

f 


J 


TABLE  11 


! 


Simulation  Results  for  Waiting  Time  Process  in  M/M/ 1 Queue 
With  P = .9,  Point  Estimates  and  95^  Confidence  Intervals 


True 

R = 60 

R = 30 

R = 15 

R = 1 

Parameter 

Value 

c = 1500 

c = 3000 

c = 6000 

c = 90,000 

9.000 

8.841 

8.863 

8.891 

8.900 

u 

.283 

.210 

/N 

r. 

9.000 

8.842 

8.864 

8.892 

8.901 

.281 

.271 

.209 

9.000 

8.843 

8.865 

8.892 

8.901 

.^9 

.270 

.207 

/S 

r 

9.000 

8.875 

8.929 

8.988 

8.950 

PI 

.217 

.225 

.278 

r 

9.000 

8.827 

8.899 

8.946 

8.908 

% 

. 187 

.210 

.253 

ro 

.4I7 

.312 

.339 

.355 

.584 

.017 

.025 

.021 

.646 

.555 

.579 

.595 

.620 

i 

.016 

.020 

.018 

.328 

.218 

.245 

.263 

.296 

.017 

.022 

.021 

R2 

.573 

.461 

.491 

.511 

.544 

.018 

.023 

.020 

P^(0) 

-105. 1 

-89.9 

7.2 

-93.5 

8.3 

-95.7 

8.9 

-96.5 

106.1 

90.9 

9^.3 

98.7 

97.5 

7.2 

8.3 

8.9 

P$(o) 

462 

00  0 

392 

43 

401 

43 

408 

P^(l) 

-1050 

-846 

-395 

-916 

-930 

84 

96 

95 

P|(2) 

589 

k77 

504 

516 

523 

46 

52 

55 

58 


r 


I 

I 

■;  TABLE  12 

t 

f Point  Estimates  and  95-,^  Confidence  Intervals  for  Variances  and 

I Covariances,  M/m/ 1 Simulation,  p = .9 


Parameter 

True 

Value 

R = 60 

c = 1500 

R = 50 

c = 3000 

R = 15 

c = 6000 

R = 1 

c = 90,000 

2 

^0 

55,901 

26,649 

7,498 

27,809 

8, 140 

28,656 

8,147 

28,601 

^01 

55,704 

26,474 

7,475 

27,651 

8,115 

28,454 

8,125 

28,420 

^02 

55,509 

26,502 

7,449 

27,454 

8,090 

28,275 

8,099 

23,242 

2 

55,509 

26,502 

7,449 

^,455 

8,091 

23,276 

8,099 

28,242 

^12 

55,51-5 

26,151 

7,425 

27,281 

8,066 

28,098 

8,076 

28,065 

2 

^2 

55,125 

25,982 

7,401 

27,107 

8,042 

27,925 

8,055 

27,890 

4' St) 

14,988 

8,142 

1,989 

9,454 

2,817 

10,548 

5,553 

10,977 

11,777 

5,745 

1,580 

6,917 

2, 112 

7,801 

2,905 

8,452 

59 


TABLE  15 


Probability  of  Coverage  fo 
Intervals  for  E(W) 

r 90,  95  and  99^0 
in  m/m/ 1 Queue^ 

Confidence 

P = .9 

Estimator 

Type  of 

C.I. 

R = 60 

c = 1500 

R = 50  R 

c = 3000  C 

= 15  R 

= 6000  C 

= 1 

= 90,000 

/s 

.90 

.88 

.93 

1.00 

1.00 

.95 

.92 

.93 

1.00 

1.00 

u 

.99 

.93 

.96 

1.00 

1.00 

/s 

.90 

.88 

.93 

1.00 

1.00 

*^1 

.95 

.92 

• 93 

1.00 

1.00 

1. 

.99 

.93 

.96 

1.00 

1.00 

/S 

.90 

.88 

• 93 

1.00 

1.00 

^2 

.95 

.92 

• 93 

1.00 

1.00 

.99 

.93 

• 96 

1.00 

1.00 

/V 

.90 

•75 

• 83 

.80 

1.00 

r 

.95 

.85 

.90 

•87 

1.00 

% 

.99 

.90 

1.00 

1.00 

1.00 

/s 

.90 

• 72 

•77 

.80 

1.00 

r 

.95 

.80 

• 87 

• 93 

1.00 

.99 

.89 

• 97 

1.00 

1.00 

(4.12)  EXAMPLE.  Waiting  Time  Prc?cess  in  an  Queue. 


We  now  turn  to  a stochastic  process  which  has  a much  more  complicated 

structure  than  any  of  the  previous  processes.  For  this  example  we  are 

interested  in  estimating  the  expected  stationary  waiting  time  in  a GI/g/c 

queue.  The  number  of  servers  in  the  queue^  c^  is  assumed  to  be  greater 

than  1.  To  apply  the  method  we  need  to  find  an  underlying  Markov  process. 

Following  Kiefer  and  Wolfowitz  (1955)  and  (1956)  we  let  W^^  be 

the  workload  of  the  server  with  the  ith  lightest  workload  just  prior 

to  the  arrival  of  the  nth  customer  and  let  v;  = (W  W ).  Again 

~n  ' nl^  ’ nc' 

let  S denote  the  service  time  of  the  nth  customer  and  A , denote  the 
n n+1 

time  between  the  arrival  of  the  nth  and  (n+l)st  customer.  Let  be 

a vector  with  c components  defined  by 


{‘+.15) 


X , = (S  - 
~n+l  n 


n+P  " n-i-l> 


A 

n+ 


A recursion  for  W can  be  defined  by  (assuming  the  system  is  initially 
~n 

empty) 

Wq  = •••> 

W , = F[(W  + X , n > 0 

~n+l  ''~n  ~n+l''  * — 

c c 

where  F is  the  function  from  ]R  to  B which  arranges  components  in 
increasing  order.  Notice  that  (4.4)  is  a spcial  case  of  (4.41)  when  c = 1. 
The  waiting  time  of  the  nth  customer  is  then  If  we  assume  that 
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n > 0}  are  i.i.d.  with  mean  and  distribution  function  G and 

that  {A  n > 1)  are  i.i.d.  with  mean  A ^ and  distribution  func- 
n'  — 

tion  it  is  easy  to  see  that  ^ is  a Markov  process 

with  state  space  E = (x  € :0  < < X2  < • • • < } . Let  p = A/(pc) . 

It  is  known  that  if  p < 1 and  if  PfA  > S } > 0 then  there  exists 

n+1  n 

an  infinite  number  of  indices  n such  that  W = (0.  ...  O) 
and  that  the  expected  time  between  such  indices  is  finite  (see  Whitt 
(1972)).  We  therefore  pick  ^ = (0,  . . . , O)  as  our  return  state.  From 
the  above  we  know  that  there  exists  a W=(W, W)  such  that 

fs*  '1'  ^ C' 

W^  W.  We  shall  be  interested  in  estimating  r = E(Wj^)^  the  expected 
stationary  waiting  time.  Let  f:E  be  defined  by  ...^  x^)  = Xj^, 

then  r = E(  f(  W)  ) . 

In  order  to  apply  the  method  it  is  necessary  to  calculate  p'^f 
for  V = 1^  •••>  Unfortunately  even  in  such  a simple  case  as  the 
M/>V'2  queue  calculation  of  the  transition  function  P is  quite  tedious 
(although  it  is  possible  to  evaluate).  For  this  reason  we  choose  k = 1 
since  evaluation  of  Pf  is  relatively  straightforward; 

(L.I5)  Pf(w)  = "2»  ''c^) 

C 

= / (w,+s-t)  dH(t)  dG(s)  + E / (w,-t)  dH(t)  dG(  s) 

Cl  >-2  0,  “ 

where  = {(s,t):  0 < Wj^+s- 1 < w^- 1 for  j > 2}  and 

C.  =((s,t):0<w,-t<w.-t  for  j>2  and  w -t  < w +s-t)  for  h > 2. 
h h—  j — h 1 — 

Notice  though  that  is  empty  for  h > 2 since  we  must  have  < w^ 
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I 

[ 


! 


I 


in  which  case  Wg-t  < (in  case  Wg  = assign  the  ambiguous  points 

to  Cg  rather  than  . Thus  for  all  values  of  c > 2 we  have 


(i+.l6)  PfCw)  = I (w, +s-t)  dH(  t)  dG(s)  + / (w  - t)  dH(  t)  dG(  s)  . 

n ^ ry  ^ 


Again  the  evaluation  of  these  integrals  is  straightforward  for  the  m/m/c 
queue.  In  this  case  we  find  that 


(i+.l7)  f^(w)  = + 


MA+U) 


-Awi 
e + 


^ A+u  ■ J 


For  this  process  the  covariance  matrix  E is  unknown  so  that 

actual  variance  reductions  cannot  be  calculated.  We  again  simulated  the 

process  for  p = .5  and  p = .9.  Our  run  lengths  were  100^000  cycles 

(500,000  customers)  for  p = .5  and  40,000  cycles  (760,000  customers) 

for  p = .9.  The  results  of  these  simulations  are  reported  in  Tables  14 

to  I7.  As  in  the  m/m/1  case  the  coverage  probabilities  generally 

increase  with  the  run  length.  For  a fixed  value  of  p the  variance 

reduction  obtained  is  less  for  the  m/m/2  queue  than  for  the  m/>V^  1 

queue.  This  is  probably  due  to  the  fact  that  the  underlying  stochastic 

process  for  m/m/2  is  quite  a bit  more  complicated  than  for  M/M/1, 

Notice  that  to  evaluate  the  function  fj^  we  must  compute  three 

exponentials.  Since  this  must  be  done  for  each  customer  we  are  substantially 

increasing  the  CPU  time  required  for  the  simulation.  In  fact  for  p = .9 

we  estimate  that  each  customer  requires  2.25  much  CPU  time  as 

2 

straightforward  simulation.  Since  (see  Table  I6)  is  greater  than  .5 
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it  must  be  concluded  that  in  this  case  the  method  is  not  computationally 
efficient.  By  this  we  mean  that  for  a fixed  amount  of  CPU  time  we  can 
get  more  accurate  estimates  by  not  using  the  variance  reduction  technique. 
For  this  reason  the  method  is  not  recommended  for  the  heavily  loaded 
GI/G/c  queue  ( c > 1)  unless  the  value  of  k can  be  increased  and  the 
functions  •••>  evaluated  cheaply. 


TABLE  li; 


Simulation  Results  for  Waiting  Time  Process  in  M/M/2  Queue  With 
p = .5,  Point  Estimates  and  95^  Confidence  Intervals 


Parameter 

True 

Value 

R = 100 

C = 1000 

R = 50 

C = 2000 

R = 25 

c = 4000 

R = 1 

C = 100,000 

.6667 

.6651 

.0212 

.6672 

.0181 

.6677 

.0182 

.6683 

/s 

.6667 

.666k 

.0184 

.6682 

.0159 

.6686 

.0159 

.6692 

/N 

r 

% 

.6667 

.6607 

.0101 

.6683 

.0100 

.6702 

.0108 

.6726 

.5029 

.0153 

.2983 

.0174 

.3052 

.0197 

.3207 

"1 

.5457 

.0141 

.5434 

.0156 

.5508 

.0171 

.5663 

2 

26.39 

4.83 

^7.00 

4.85 

2j.lk 

4.50 

27.35 

'^01 

22.90 

4.46 

23.44 

4.48 

23.56 

4.15 

25.75 

2 

'^2 

20. 10 

4.11 

20.57 

4.15 

20.67 

3.84 

20.85 

7.38 

1.35 

7.9^ 

1.58 

8.36 

1.71 

8.77 
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TABLE  15 


Probability  of  Coverage  for  90^  95  and  99fj  Confidence 
Intervals  for  E(W)  in  m/m/2  Queue^  p = .5 


Estimator 

Type  of 

C.I. 

R = 100 

C = 1000 

R = 50 

C = 2000 

R = 25 

C = 1+000 

R = 1 

C = 100 

.90 

.80 

.90 

.88 

1.00 

.95 

.85 

.92 

.96 

1.00 

.99 

.92 

.96 

.96 

1.00 

1 ^ 

.90 

.80 

.90 

.88 

1.00 

^1 

• 95 

.8L 

.92 

.96 

1.00 

.99 

.95 

.96 

.98 

1.00 

/N 

.90 

.81; 

.92 

.80 

1.00 

r 

.95 

.87 

.9^ 

.92 

1.00 

.99 

.96 

.96 

1.00 

1.00 

TABLE  16 


Simulation  Results  for  Waiting  Time  Process  in  M/2  Queue 
with  p = ,9^  Point  Estimates  and  95^  Confidence  Intervals 


R=52  R=16  R=8  R=1 

True 


Parameter 

Value 

c = 1250 

c = 2500 

c = 5000 

C = 400, ( 

"0 

8.526 

8.68k 

.481 

8.715 

.500 

8.728 

.408 

8.745 

yN 

"1 

8.526 

8.686 

.479 

8.714 

.498 

8.729 

.406 

8.746 

/S 

8.526 

8.772 

.511 

8.795 

.417 

8.904 

.556 

8.858 

.556 

.044 

.559 

.049 

.556 

.049 

.577 

• 727 
.050 

.745 

.035 

.744 

.035 

.760 

2 

52,975 

24,283 

55,206 

25,518 

57,671 

23,670 

57,720 

^01 

52,788 

24,232 

55,015 

25,465 

57,475 

25,627 

57,525 

2 

‘^2 

52,604 
24, 132 

54,827 

25,412 

57,278 

23,584 

57,528 

24,055 

9,  065 

28,660 

12,112 

51,928 

15,555 

55,523 
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TABLE  17 

Probability  of  Coverage  for  90^  95  and  99^  Confidence 
Intervals  for  E(W)  in  m/m/2  Queue^  p = .9 


Estimator 

Type  of 

C.I. 

R = 32 

c = 1250 

R = 15 

c = 2500 

R = 8 

c = 5000 

R = 1 

C = 40,000 

/N 

.90 

.8k 

.94 

1.00 

1.00 

^0 

.95 

.88 

.94 

1.00 

1.00 

.99 

.91 

.94 

1.00 

1.00 

.90 

.8L 

.94 

1.00 

1.00 

.95 

.88 

.94 

1.00 

1.00 

• 99 

•91 

.94 

1.00 

1.00 

.90 

.8L 

.88 

.88 

1.00 

r 

.95 

.84 

.94 

1.00 

1.00 

.99 

.91 

.94 

1.00 

1.00 

In  this  paper  a new  variance  reduction  technique  for  a wide  class 
of  stochastic  processes  is  proposed  and  tested.  The  method  differs  from 
most  other  control  variable  methods  in  that  the  means  of  the  control 
variables  do  not  need  to  be  known  explicitly.  The  method  is  capable  of 
producing  substantial  variance  reductions.  Because  the  method  requires 
additional  computations  to  be  done  both  before  and  during  the  simulation^ 
care  must  be  taken  so  that  the  method  is  used  only  when  it  is  computationally 
advantageous  to  do  so^  that  is  it  should  only  be  used  when  for  a fixed 
amount  of  computer  time  more  accurate  estimates  can  be  obtained  by  using 
the  method  than  by  not  using  it.  In  the  case  of  Markov  chains  it  is 
likely  that  the  method  will  be  most  effective  when  the  transition  matrix 
of  the  process  is  sparse^  in  which  case  the  preliminary  calculations  can 
be  carried  out  with  relative  ease.  It  is  for  this  type  of  process  that 
the  method  is  recommended. 
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